10 research outputs found

    Adaptive value function approximation in reinforcement learning using wavelets

    A thesis submitted to the Faculty of Science, School of Computational and Applied Mathematics, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, South Africa, July 2015. Reinforcement learning agents solve tasks by finding policies that maximise their reward over time. The policy can be found from the value function, which represents the value of each state-action pair. In continuous state spaces, the value function must be approximated. Often, this is done using a fixed linear combination of functions across all dimensions. We introduce and demonstrate the wavelet basis for reinforcement learning, a basis function scheme competitive with state-of-the-art fixed bases. We extend two online adaptive tiling schemes to wavelet functions and show their performance improvement across standard domains. Finally, we introduce the Multiscale Adaptive Wavelet Basis (MAWB), a wavelet-based adaptive basis scheme which is dimensionally scalable and insensitive to the initial level of detail. This scheme adaptively grows the basis function set by combining across dimensions, or splitting within a dimension, those candidate functions which have a high estimated projection onto the Bellman error. A number of novel measures are used to find this estimate.

    An investigation of data compression techniques for hyperspectral core imager data

    We investigate algorithms for tractable analysis of real hyperspectral image data from core samples provided by AngloGold Ashanti. In particular, we investigate feature extraction, non-linear dimension reduction using diffusion maps, and wavelet approximation methods on our data.

    An investigation into the detection of seafloor massive sulphides through sonar

    M.Sc., Faculty of Science, University of the Witwatersrand, 2011. Seafloor massive sulphides are deep sea mineral deposits currently being examined as a potential mining resource. Locating these deposits, which occur at depths in the order of 2 km, is currently performed by expensive submersible sonar platforms as conventional sonar bathymetry products gathered by sea surface platforms do not achieve adequate spatial resolution. This document examines the use of so-called high resolution beamforming methods (such as MUSIC and ESPRIT) for sonar bathymetry, together with combinations of parameter estimation techniques, including techniques for full rank covariance matrix estimation and signal enumeration. These methods are tested for bathymetric profile accuracy using simulated data, and compared to conventional bathymetric methods. It was found that high resolution methods achieved greater bathymetric accuracy and higher resolution than conventional beamforming. These methods were also robust in the presence of unwanted persistent signals and low signal to noise ratios.

    Upper Bounds on the Performance of Discretisation in Reinforcement Learning

    Reinforcement learning is a machine learning framework whereby an agent learns to perform a task by maximising the total reward it receives for selecting actions in each state. The policy mapping states to actions that the agent learns is represented either explicitly or implicitly through a value function. It is common in reinforcement learning to discretise a continuous state space using tile coding or binary features. We prove an upper bound on the performance of discretisation for direct policy representation or value function approximation.

    Object-oriented methods for habitat mapping at multiple scales - Case studies from Northern Germany and Wye Downs

    This paper presents an application of object-oriented techniques for habitat classification based on remotely sensed images and ancillary data. The study reports the results of habitat mapping at multiple scales using Earth Observation (EO) data at various spatial resolutions and multi-temporal acquisition dates. We investigate the role of object texture and context in classification as well as the value of integrating knowledge from ancillary data sources. Habitat maps were produced at regional and local scales in two case studies: Schleswig-Holstein, Germany and Wye Downs, United Kingdom. At the regional scale, the main task was the development of a consistent object-oriented classification scheme that is transferable to satellite images for other years. This is demonstrated for a time series of Landsat TM/ETM+ scenes. At the local scale, investigations focus on the development of appropriate object-oriented rule networks for the detailed mapping of habitats, e.g. dry grasslands and wetlands, using very high resolution satellite and airborne scanner images. The results are evaluated using statistical accuracy assessment and visual comparison with traditional field-based habitat maps. Whereas the application of traditional pixel-based classification results in a pixelised (salt and pepper) representation of land cover, the object-based classification technique results in solid habitat objects, allowing easy integration into a vector-GIS for further analysis. The level of detail obtained at the local scale is comparable to that achieved by visual interpretation of aerial photographs or field-based mapping and also retains spatially explicit, fine scale information such as scrub encroachment or ecotone patterns within habitats.

    Assessing the value of imperfect biocontainment nationally: Rapeseed in the United Kingdom as an exemplar

    Paternal biocontainment methods (PBMs) act by preventing pollen-mediated transgene flow. They are compromised by transgene escape via the crop-maternal line. We therefore assess the efficacy of PBMs for transgenic rapeseed (Brassica napus) biocontainment across the United Kingdom by estimating crop-maternal hybridization with its two progenitor species. We used remote sensing, field surveys, agricultural statistics, and meta-analysis to determine the extent of sympatry between the crop and populations of riparian and weedy B. rapa and B. oleracea. We then estimated the incidence of crop-maternal hybridization across all settings to predict the efficacy of PBMs. Evidence of crop chloroplast capture by the progenitors was expanded to a national scale, revealing that crop-maternal gene flow occurs at widely variable rates and is dependent on both the recipient and setting. We use these data to explore the value that this kind of biocontainment can bring to genetic modification (GM) risk management in terms of reducing the impact that hybrids have on the environment rather than preventing or reducing hybrid abundance per se.